Skip to content

Add remote signer GCO KMS#3182

Merged
alpe merged 3 commits intoalex/3163_kmsfrom
alex/3163_kms_gcp
Mar 20, 2026
Merged

Add remote signer GCO KMS#3182
alpe merged 3 commits intoalex/3163_kmsfrom
alex/3163_kms_gcp

Conversation

@alpe
Copy link
Contributor

@alpe alpe commented Mar 20, 2026

Overview

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9582d32e-25eb-44b4-af6c-6a050977497d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch alex/3163_kms_gcp
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 20, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedMar 20, 2026, 2:39 PM

@claude
Copy link
Contributor

claude bot commented Mar 20, 2026

Claude finished @alpe's task in 3m 39s —— View job


Code Review: Add remote signer GCP KMS

  • Gather context and understand the PR
  • Review pkg/signer/gcp/signer.go — core GCP KMS signer implementation
  • Review pkg/signer/gcp/signer_test.go — tests
  • Review pkg/signer/factory.go — signer factory integration
  • Review pkg/config/config.go — configuration changes
  • Review pkg/cmd/run_node.go — command wiring
  • Review test/e2e/evm_kms_e2e_test.go — E2E tests
  • Review documentation (gcp-kms-signer.md, aws-kms-signer.md)
  • Summarize findings

Overall Assessment

This is a solid, well-structured addition that closely mirrors the existing AWS KMS signer. The concurrency model, retry-with-backoff, and CRC integrity verification are good practices. The test coverage is thorough for the happy path and most error cases. A few issues worth addressing before merge:


Issues

🔴 High — Verify WithAuthCredentialsFile argument order

pkg/signer/gcp/signer.go:87

clientOpts = append(clientOpts, option.WithAuthCredentialsFile(option.ServiceAccount, opts.CredentialsFile))

option.WithAuthCredentialsFile is from the newer Google auth library. In google.golang.org/api v0.272.0, its signature is:

func WithAuthCredentialsFile(credentialsFile string, opts ...authopt.ClientOption) ClientOption

If that's the case, option.ServiceAccount (the string constant "serviceAccount") is being passed as the credentials file path, and opts.CredentialsFile (the actual path) is being passed as a variadic authopt.ClientOption — which won't compile since string doesn't satisfy that interface.

The build CI passes, which suggests either (a) the function signature in this version differs from what I'm reading, or (b) there's some implicit coercion. Either way, this is the single most important thing to manually verify. The conventional alternative is:

option.WithCredentialsFile(opts.CredentialsFile)

which is unambiguous and has been stable across all library versions. Fix this →


🟡 Medium — GCP client is never closed (resource leak)

pkg/signer/gcp/signer.go:40-51

cloudKMSClient wraps a *kms.KeyManagementClient which holds an open gRPC connection. KmsSigner has no Close() method, and the Signer interface doesn't require one. The connection will be leaked until GC collects it. For a long-lived node process this is acceptable (one connection per startup), but worth documenting or tracking.

If any path creates-and-discards a KmsSigner (e.g. key rotation, future dynamic reload), this becomes a real leak. Consider either:

  • Implementing io.Closer on KmsSigner and calling it from teardown, or
  • Adding a comment explicitly noting the connection is intentionally long-lived.

🟡 Medium — codes.Internal and codes.Aborted in retryable set

pkg/signer/gcp/signer.go:301

case codes.DeadlineExceeded, codes.Unavailable, codes.ResourceExhausted, codes.Aborted, codes.Internal:
    return true

codes.Internal represents unexpected server-side errors (bugs, panics) and is generally not safe to retry blindly — GCP's own AIP-194 recommends retrying only Unavailable and ResourceExhausted. codes.Aborted is similarly context-dependent. Retrying codes.Internal on a signing call (which should be idempotent) is safe in this specific case, but retrying it in general can amplify load on a misbehaving server. At minimum, consider logging a warning when retrying codes.Internal. Fix this →


🟡 Medium — Slow unit tests with real backoff delays

pkg/signer/gcp/signer_test.go:128, 169, 190, 221

TestSign_KMSFailure and similar tests using default MaxRetries: 3 incur real exponential backoff sleeps (100ms + 200ms + 400ms = ~700ms per test). These tests don't need real timing — they just count call attempts. Pass &Options{MaxRetries: 3, Timeout: 1 * time.Millisecond} (or a dedicated test-fast option) to keep the test suite snappy. Fix this →


🟡 Medium — No test for context cancellation during backoff

pkg/signer/gcp/signer.go:186-190

select {
case <-ctx.Done():
    return nil, fmt.Errorf("KMS Sign canceled: %w", ctx.Err())
case <-time.After(backoff):
}

This path is exercised in production when a caller cancels the context mid-retry, but it has no test coverage. A test with a cancelled context and a MaxRetries > 0 would cover this. Fix this →


🟢 Low — Dead code path in retryBackoff

pkg/signer/gcp/signer.go:235-237

if backoff > maxRetryBackoff {
    return maxRetryBackoff
}

The early return on lines 229–231 (if backoff >= maxRetryBackoff/2 { return maxRetryBackoff }) ensures backoff is always capped before it could exceed maxRetryBackoff. This final check is unreachable. Fix this →


🟢 Low — Redundant logging switch in run_node.go

pkg/cmd/run_node.go:140-149

switch nodeConfig.Signer.KMS.Provider {
case "aws":
    logger.Info().Msg("initialized AWS KMS signer via factory")
case "gcp":
    logger.Info().Msg("initialized GCP KMS signer via factory")
default:
    logger.Info().Str("provider", ...).Msg("initialized KMS signer via factory")
}

The first two cases can be collapsed into the default since they all log the same semantic event. This also makes it future-proof for new providers:

logger.Info().Str("provider", nodeConfig.Signer.KMS.Provider).Msg("initialized KMS signer via factory")

Fix this →


🟢 Low — Invalid gcloud IAM command in documentation

docs/guides/operations/gcp-kms-signer.md:50-53

gcloud projects add-iam-policy-binding <project> \
    --member "serviceAccount:kms-go-client@<project>.iam.gserviceaccount.com" \
    --role "roles/cloudkms.publicKeyViewer" \
    --role "roles/cloudkms.signerVerifier"

gcloud projects add-iam-policy-binding accepts only one --role per invocation. This command will silently ignore one role or error. It should be two separate commands. Fix this →


🟢 Low — Docs nav may need updating

The new gcp-kms-signer.md file was added, but it should be verified that .vitepress/config.ts includes it in the sidebar — otherwise the guide won't appear in the documentation site navigation.


Positives

  • CRC integrity verification on both request and response is exactly right and follows GCP's recommended best practices.
  • Eager public key fetch on construction is the right call — fail fast on misconfiguration rather than at signing time.
  • Interface-based KMSClient makes unit testing clean and avoids any real GCP calls.
  • kmsSignerFromClient vs NewKmsSigner separation is a clean pattern consistent with the AWS signer.
  • Config validation is thorough — required fields, positive timeouts, and non-negative retries are all checked.
  • E2E test structure (runEvmSequencerWithKMSSignerE2E) is well-factored and correctly skips when no credentials are provided.
  • The retry logic handles context cancellation correctly by checking ctx.Done() both in the error check and during the backoff sleep.

@codecov
Copy link

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 76.52582% with 50 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.48%. Comparing base (5363cdb) to head (2f16b8f).
⚠️ Report is 3 commits behind head on alex/3163_kms.

Files with missing lines Patch % Lines
pkg/signer/gcp/signer.go 69.34% 29 Missing and 13 partials ⚠️
pkg/cmd/run_node.go 0.00% 8 Missing ⚠️
Additional details and impacted files
@@                Coverage Diff                @@
##           alex/3163_kms    #3182      +/-   ##
=================================================
+ Coverage          61.23%   61.48%   +0.25%     
=================================================
  Files                119      120       +1     
  Lines              12235    12421     +186     
=================================================
+ Hits                7492     7637     +145     
- Misses              3904     3933      +29     
- Partials             839      851      +12     
Flag Coverage Δ
combined 61.48% <76.52%> (+0.25%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@alpe alpe marked this pull request as ready for review March 20, 2026 14:42
@alpe alpe merged commit a4f7bf6 into alex/3163_kms Mar 20, 2026
20 of 24 checks passed
@alpe alpe deleted the alex/3163_kms_gcp branch March 20, 2026 14:42
@github-actions
Copy link
Contributor

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-03-20 14:44 UTC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant